Bayes Decision Rules and Confidence Measures for Statistical Machine Translation

Authors

  • Nicola Ueffing
  • Hermann Ney
Abstract

In this paper, we re-visit the foundations of the statistical approach to machine translation and study two forms of the Bayes decision rule: the common rule for minimizing the number of string errors and a novel rule for minimizing the number of symbol errors. The Bayes decision rule for minimizing the number of string errors is widely used, but its justification is rarely questioned. We study the relationship between the Bayes decision rule, the underlying error measure, and word confidence measures for machine translation. The derived confidence measures are tested on the output of a state-of-the-art statistical machine translation system. Experimental comparison with existing confidence measures is presented on a translation task consisting of technical manuals.
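
The contrast between the two decision rules can be made concrete with a small sketch. It is not the paper's implementation: it assumes a toy N-best list whose hypotheses all have the same length, posterior probabilities that are already normalized, and a purely position-wise comparison of words. The function names (`string_error_decision`, `word_posteriors`, `symbol_error_decision`) and the example sentences are hypothetical.

```python
from collections import defaultdict

# Toy N-best list: (hypothesis words, posterior probability of the whole string).
# Assumption: all hypotheses have equal length and the probabilities sum to 1.
NBEST = [
    (("press", "the", "button"), 0.4),
    (("push", "a", "button"), 0.3),
    (("push", "the", "key"), 0.3),
]

def string_error_decision(nbest):
    """Minimize string errors: pick the single most probable whole hypothesis."""
    return list(max(nbest, key=lambda hyp: hyp[1])[0])

def word_posteriors(nbest):
    """For each position, sum the posteriors of all hypotheses containing each word there."""
    length = len(nbest[0][0])
    posteriors = [defaultdict(float) for _ in range(length)]
    for words, prob in nbest:
        for position, word in enumerate(words):
            posteriors[position][word] += prob
    return posteriors

def symbol_error_decision(nbest):
    """Minimize position-wise symbol errors: pick the most probable word at each position."""
    return [max(dist, key=dist.get) for dist in word_posteriors(nbest)]

if __name__ == "__main__":
    best_string = string_error_decision(NBEST)
    posteriors = word_posteriors(NBEST)
    # The position-wise posterior of each chosen word doubles as a word confidence score.
    confidences = [posteriors[i][w] for i, w in enumerate(best_string)]
    print("string-error rule:", best_string)
    print("word confidences: ", confidences)
    print("symbol-error rule:", symbol_error_decision(NBEST))
```

In this toy list the two rules disagree: the most probable full string begins with "press", while most of the probability mass at the first position falls on "push". The same position-wise posteriors are a simple form of word confidence measure, which is the link between decision rule and confidence that the abstract describes.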


Similar resources

Error Measures and Bayes Decision Rules Revisited with Applications to POS Tagging

Starting from first principles, we re-visit the statistical approach and study two forms of the Bayes decision rule: the common rule for minimizing the number of string errors and a novel rule for minimizing the number of symbol errors. The Bayes decision rule for minimizing the number of string errors is widely used, e.g., in speech recognition, POS tagging and machine translation, but its jus...

Fluency Constraints for Minimum Bayes-Risk Decoding of Statistical Machine Translation Lattices

A novel and robust approach to improving statistical machine translation fluency is developed within a minimum Bayes-risk decoding framework. By segmenting translation lattices according to confidence measures over the maximum likelihood translation hypothesis, we are able to focus on regions with potential translation errors. Hypothesis space constraints based on monolingual coverage are applied...

Training and Evaluating Error Minimization Decision Rules for Statistical Machine Translation

Decision rules that explicitly account for non-probabilistic evaluation metrics in machine translation typically require special training, often to estimate parameters in exponential models that govern the search space and the selection of candidate translations. While the traditional Maximum A Posteriori (MAP) decision rule can be optimized as a piecewise linear function in a greedy search of ...

Training and Evaluating Error Minimization Rules for Statistical Machine Translation

Decision rules that explicitly account for non-probabilistic evaluation metrics in machine translation typically require special training, often to estimate parameters in exponential models that govern the search space and the selection of candidate translations. While the traditional Maximum A Posteriori (MAP) decision rule can be optimized as a piecewise linear function in a greedy search of ...

Estimation of Confidence Measures for Machine Translation

Confidence Estimation has been extensively used in Speech Recognition and now it is also being applied in Statistical Machine Translation. Its basic goal is to estimate a confidence measure for each word in a given hypothesis, in order to locate those words, if any, that are likely to be incorrectly recognised or translated. It can be seen as a two-class pattern recognition problem in which eac...


Publication date: 2004